
Stabilize -Cmin-function-alignment #142824

Open
wants to merge 2 commits into
base: master

folkertdev
Contributor

tracking issue: #82232
split out from: #140261

Request for Stabilization

Summary

The -Cmin-function-alignment=<align> flag specifies the minimum alignment of functions for which code is generated.
The align value must be a power of 2, other values are rejected.

Note that -Zbuild-std (or similar) is required to apply this minimum alignment to standard library functions.
By default, these functions come precompiled and their alignments won't respect the min-function-alignment flag.

This flag is equivalent to:

  • -fmin-function-alignment for GCC
  • -falign-functions for Clang

The specified alignment is a minimum. A higher alignment can be specified for specific functions by annotating the function with a #[align(<align>)] attribute.
The attribute's value is ignored when it is lower than the value passed to min-function-alignment.

There are two additional edge cases for this flag:

  • targets have a minimum alignment for functions (e.g. on x86_64 the lowest that LLVM generates is 16 bytes).
    A min-function-alignment value lower than the target's minimum has no effect.
  • the maximum alignment supported by this flag is 8192. Trying to set a higher value results in an error.
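As a rough illustration of how the rules in this summary combine, the resulting alignment of a function can be modeled as the largest of three values: the target's own minimum, the flag value, and the #[align] value. The sketch below is only a model of that description, not code from rustc; `effective_alignment` and its parameters are hypothetical names.

```rust
/// Models the rules from the summary. All names here are illustrative
/// assumptions; this is not how rustc is implemented.
///
/// * `target_min`: the lowest alignment the target generates anyway
///   (the summary gives 16 bytes for x86_64).
/// * `flag`: the value passed to `-Cmin-function-alignment`, if any.
/// * `attr`: the value of an `#[align(..)]` attribute on the function, if any.
fn effective_alignment(target_min: u64, flag: Option<u64>, attr: Option<u64>) -> u64 {
    // An absent flag or attribute imposes no requirement (alignment 1).
    target_min
        .max(flag.unwrap_or(1))
        .max(attr.unwrap_or(1))
}
```

Under this model, `effective_alignment(16, Some(8), None)` stays at 16 (a flag below the target minimum has no effect), and `effective_alignment(16, Some(32), Some(8))` is 32 (an attribute lower than the flag is ignored).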

Testing

History

The -Zmin-function-alignment flag was requested by rust-for-linux #128830. It will be used soon (see #t-compiler/help > ✔ Alignment for function addresses).

Miri has supported function alignment since #140072. In const-eval there is no way to observe the address of a function pointer, so no special attention is needed there (see #t-compiler/const-eval > function address alignment).

Originally, the maximum allowed alignment was 1 << 29, because this is the highest value the LLVM API accepts. However, on COFF the highest supported alignment is only 8192 (see #142638). Practically speaking, that seems more than sufficient for all known use cases. So for simplicity, for now, we limit the alignment to 8192. The value can be increased on platforms that support it if the need arises.


r? @workingjubilee

the first commit can be split out if that is more convenient.

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 21, 2025
@rustbot
Collaborator

rustbot commented Jun 21, 2025

Some changes occurred to the CTFE machinery

cc @RalfJung, @oli-obk, @lcnr

Some changes occurred to the CTFE / Miri interpreter

cc @rust-lang/miri

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

The Miri subtree was changed

cc @rust-lang/miri

// alignment that works on all target platforms. COFF does not support higher alignments.
if bytes > 8192 {
    return false;
}
Member

Why should it be limited on all platforms? Can't we error when the alignment exceeds the maximum that the actual target we are compiling for supports? Maybe someone genuinely needs to align to 16k on ELF for whatever reason?

Contributor Author

Well, so as we've discovered, alignment is just a mess between clang and llvm. For instance, -falign-functions only goes up to 1 << 16, although you can manually align a function to a much higher alignment.

For 8192, we know it'll work everywhere, and the logic for when to accept/reject an alignment value is clear.

This flag aligns all functions to the minimum, so I have a hard time seeing a realistic scenario where aligning all functions to 16k is a reasonable thing to do (the limit on individual functions will be handled separately).

Contributor Author

So I think the options are:

  1. Limit to 8192 on all platforms, it's consistent and unlikely to cause issues. The limit can be safely raised in the future if a need arises
  2. Follow clang and allow 1 << 16, except when the target object format is COFF, then the limit is 8192.
  3. Accept what llvm accepts: allow 1 << 29, except when the target object format is COFF, then the limit is 8192.

I've picked the most conservative one (again, with the option to relax the limits if the need ever arises), but if there is consensus now on one of the other options that's also fine.
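To make the three options easier to compare, here is a minimal sketch that computes the upper limit each one would give for a given object format. The enum and function names are hypothetical and chosen for illustration only; the numeric limits are the ones quoted in the list above.

```rust
/// Hypothetical stand-in for the target's object format.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ObjectFormat {
    Coff,
    Elf,
}

/// The three options from the list above (names are illustrative).
#[derive(Clone, Copy, Debug)]
enum LimitPolicy {
    /// Option 1: 8192 everywhere.
    Uniform,
    /// Option 2: 1 << 16, except on COFF.
    FollowClang,
    /// Option 3: 1 << 29, except on COFF.
    FollowLlvm,
}

/// Returns the largest alignment accepted under `policy` for `format`.
fn max_alignment(policy: LimitPolicy, format: ObjectFormat) -> u64 {
    match (policy, format) {
        (LimitPolicy::Uniform, _) => 8192,
        // COFF caps the limit at 8192 under every option.
        (_, ObjectFormat::Coff) => 8192,
        (LimitPolicy::FollowClang, ObjectFormat::Elf) => 1 << 16,
        (LimitPolicy::FollowLlvm, ObjectFormat::Elf) => 1 << 29,
    }
}
```

Only option 1 gives an answer that does not depend on the target, which is the consistency argument made above.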

Member

It is my understanding that on Linux kernels we still nominally support, overaligning beyond the page size, typically 4096, is not going to work properly, so it is arguable that even 8192 is too high.

Member

@jieyouxu jieyouxu Jun 24, 2025

IMO for the long term, hard-coding to 8192 across all platforms seems like a mistake -- especially because then, does this become part of the language or, even more problematic, become part of (compiler/language) stability guarantees?

EDIT: or rather, what is a suitable upper bound in this case in terms of stability guarantees? As in, if we stabilize n=8192 as a universal upper bound, what happens if we find out in the future that some platform cannot handle that?

let fn_align = self.tcx.codegen_fn_attrs(instance.def_id()).alignment;
- let global_align = self.tcx.sess.opts.unstable_opts.min_function_alignment;
+ let global_align = self.tcx.sess.opts.cg.min_function_alignment;
Member

(orthogonal to this PR)
it is not great that the logic for merging the per-fn alignment and the global alignment needs to be repeated in each backend.

Maybe codegen_fn_attrs should just take min_function_alignment into account?

Member

In fact, using sess.opts in the interpreter is at best iffy since it means the interpreter can behave differently in different crates in the same crate graph which can cause unsoundness...

@traviscross
Contributor

@rustbot labels +T-lang

If the idea with this is that passing -Cmin-function-alignment represents a language-level guarantee, e.g. that a naked function could validly rely on this minimum alignment holding when this flag is passed to prove the soundness of its operations -- that from a language spec standpoint, this is equivalent to marking each function with #[align(..)] -- then I'd suggest that a dual FCP with lang here is proper.

cc @rust-lang/lang

@rustbot rustbot added the T-lang Relevant to the language team label Jun 22, 2025
@folkertdev
Contributor Author

Yeah that sounds right. There is also a bunch of overlap with rfc 3806.

Specifically, it contains a section that attempts to define what alignment values are accepted. #[align] and -Cmin-function-alignment should accept/reject the same values.

@traviscross
Contributor

@rust-lang/spec: What thoughts do we have about how to handle compiler flags that affect the language definition?

cc @RalfJung

@traviscross traviscross added needs-fcp This change is insta-stable, or significant enough to need a team FCP to proceed. S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). S-waiting-on-documentation Status: Waiting on approved PRs to documentation before merging labels Jun 22, 2025
@traviscross
Contributor

traviscross commented Jun 22, 2025

Here's a question. Since this does have language-level effect, might this be better expressed as a crate-level attribute?

If there's a good reason, given the use case, for this to be a compiler flag instead, probably it'd be good to describe that in some detail.

@ojeda
Contributor

ojeda commented Jun 22, 2025

If it can then be applied via --crate-attr similarly, then that should be fine for our use case.

jhpratt added a commit to jhpratt/rust that referenced this pull request Jun 22, 2025
…-alignment, r=workingjubilee

centralize `-Zmin-function-alignment` logic

tracking issue: rust-lang#82232
discussed in: rust-lang#142824 (comment)

Apply the `-Zmin-function-alignment` value to the alignment field of the function attributes when those are created, so that individual backends don't need to consider it.

The one exception right now is cranelift, because it can't yet set the alignment for individual functions, but it can (and does) set the global minimum function alignment.

cc `@RalfJung` I think this is an improvement regardless, is there anything else that should be done for miri?
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Jun 23, 2025
@workingjubilee
Member

Huh. That could be... interesting to work with, since it would logically apply only to entities that originate from that crate, right?

@workingjubilee
Member

workingjubilee commented Jun 23, 2025

@folkertdev Do any of our tests verify that if you

  • use this option on the functions described in a crate
  • one is generic or from a trait or otherwise non-instantiated
  • you instantiate that function in another crate without this flag

Then the resulting function is instantiated with the correct alignment? Where "correct" is... well, pick an answer, I guess?

rust-timer added a commit that referenced this pull request Jun 23, 2025
Rollup merge of #142854 - folkertdev:centralize-min-function-alignment, r=workingjubilee

@bors
Collaborator

bors commented Jun 23, 2025

☔ The latest upstream changes (presumably #142901) made this pull request unmergeable. Please resolve the merge conflicts.

@folkertdev
Contributor Author

@workingjubilee we don't, currently. I think the expected answer is clear though, because -Zmin-function-alignment is applied to all compiled crates, not just the top-level one?

I'm not sure how that would work with #![min_function_alignment = N], because then library crates could set the value as well? So far I've been thinking of this option as similar to -Ctarget-cpu or -Ctarget-feature: they are not properties of a crate but of the compilation as a whole.

> What thoughts do we have about how to handle compiler flags that affect the language definition?

Is setting and then relying on the alignment fundamentally different from -Ctarget-feature=+avx2 and then asserting in the program that avx2 is available?

@traviscross
Contributor

traviscross commented Jun 23, 2025

> Is setting and then relying on the alignment fundamentally different from -Ctarget-feature=+avx2 and then asserting in the program that avx2 is available?

It's about whether we like this as a safety argument:

/// SAFETY: The pointee must be a 32-byte aligned function.  It's OK
/// for the low-order bits of the function pointer itself to be
/// non-zero, e.g., if they're used for controlling the processor
/// mode.
unsafe fn f(x: fn()) { todo!() }
fn g() {
    let p = g as fn();
    // SAFETY: We compiled with `-Cmin-function-alignment=32`.
    unsafe { f(p) };
}

For avx2, I'd expect that the safety argument would be made by use of cfg(target_feature = ".."), if is_x86_feature_detected!("avx2") { .. }, etc., rather than by using the argument that it was compiled with -Ctarget-feature=+avx2 in the proof.
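For comparison, the kind of in-program evidence described here for avx2 could look like the sketch below. `cfg!(target_feature = ..)` and `is_x86_feature_detected!` are existing standard-library facilities; the two wrapper function names are hypothetical and only serve the illustration.

```rust
/// Compile-time evidence: true only if this crate was compiled with AVX2
/// enabled (for example via `-Ctarget-feature=+avx2`).
fn avx2_at_compile_time() -> bool {
    cfg!(target_feature = "avx2")
}

/// Run-time evidence on x86 and x86_64: asks the CPU the program runs on.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn avx2_at_run_time() -> bool {
    std::is_x86_feature_detected!("avx2")
}

/// On other architectures AVX2 does not exist, so the answer is always no.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn avx2_at_run_time() -> bool {
    false
}
```

In both cases the safety argument rests on something the program itself can check, rather than on knowledge of the command line it was built with.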

@jieyouxu
Member

> they are not properties of a crate but of the compilation as a whole.

Wait so, does this need to be handled like Target Modifiers?

@folkertdev folkertdev force-pushed the stabilize-min-function-alignment branch from 302017f to 2010f9a Compare June 24, 2025 12:21
@rustbot rustbot added the A-attributes Area: Attributes (`#[…]`, `#![…]`) label Jun 24, 2025
@rustbot
Collaborator

rustbot commented Jun 24, 2025

Some changes occurred in compiler/rustc_codegen_ssa/src/codegen_attrs.rs

cc @jdonszelmann

@folkertdev
Contributor Author

> Wait so, does this need to be handled like Target Modifiers?

well, two crates with a different minimum alignment can work together just fine. In fact, this is usually the case because the standard library is pre-compiled and hence may not have the same minimum alignment value. It is only when the minimum alignment is relied on in unsafe code that you'd ever run into trouble.

The goal of -Cmin-function-alignment is performance: by aligning the code, you'll likely have fewer cache misses. When the specific alignment value matters for correctness, (e.g. for pointer tagging), I've only seen #[align(4)] or similar being used so far.


> For avx2, I'd expect that the safety argument would be made by use of cfg(target_feature = ".."), if is_x86_feature_detected!("avx2") { .. }, etc., rather than by using the argument that it was compiled with -Ctarget-feature=+avx2 in the proof.

Similarly for alignment I'd expect something like (func as usize).is_multiple_of(alignment) (pending a better way of getting the function address).

Btw I don't particularly like the kind of safety argument in the example, but I think it is no worse than just assuming that a target feature is enabled.
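A minimal sketch of such a dynamic check, assuming the function address is obtained by casting the function pointer to an integer (the comment above notes that a better way of getting the address is still pending). `fn_addr` and `is_fn_aligned` are hypothetical helper names, and the modulo test stands in for `is_multiple_of`.

```rust
/// Returns the address of a function pointer as an integer.
/// Going through `*const ()` is one way to perform the cast today.
fn fn_addr(f: fn()) -> usize {
    f as *const () as usize
}

/// Checks at run time whether `f` is aligned to `align` bytes.
/// `align` must be a power of two.
fn is_fn_aligned(f: fn(), align: usize) -> bool {
    assert!(align.is_power_of_two(), "alignment must be a power of two");
    fn_addr(f) % align == 0
}

/// A function to test against; any `fn()` would do.
fn example() {}
```

A caller would then guard the alignment-dependent code path with something like `if is_fn_aligned(example, 32) { .. }` instead of assuming the flag was passed. On targets that use low-order bits of the function pointer for other purposes, those bits would need to be masked off first.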

@traviscross traviscross added I-lang-nominated Nominated for discussion during a lang team meeting. P-lang-drag-1 Lang team prioritization drag level 1. https://rust-lang.zulipchat.com/#narrow/channel/410516-t-lang labels Jun 24, 2025
@tmandry
Member

tmandry commented Jun 25, 2025

I think it's entirely valid to use a compiler flag for performance concerns like this, but then the flag should probably be considered more as a hint than a guarantee. I actually like clang's naming better for this purpose.

I think for language-level guarantees we should probably use #[align] or possibly a crate-level attribute; that's a good callout by @traviscross.

I think there are gray areas; I could see there being situations where you need to apply something to a whole crate graph and are forced to rely on either compiler flags or linker tricks. As far as Rust is concerned I like --crate-attr as a way to make it clear when you are transitioning into the realm of these guarantees. I expect there are cases where we haven't followed this distinction and we'll need to clean it up over time.

@jieyouxu
Member

> The align value must be a power of 2, other values are rejected.

Do we actually have test coverage for this? I.e. -Cmin-function-alignment=0 or
-Cmin-function-alignment=3?

> A min-function-alignment value lower than the target's minimum has no effect.

Should this be a warning? If this is intended as a hint, then 🤷.

@bjorn3
Member

bjorn3 commented Jun 25, 2025

> The goal of -Cmin-function-alignment is performance: by aligning the code, you'll likely have fewer cache misses. When the specific alignment value matters for correctness, (e.g. for pointer tagging), I've only seen #[align(4)] or similar being used so far.

Doesn't LLVM already insert padding between functions for alignment by default? I believe it aligns to 16 bytes on x86_64 for example. Cranelift also does that.

@folkertdev
Contributor Author

> Do we actually have test coverage for this? I.e. -Cmin-function-alignment=0 or -Cmin-function-alignment=3?

This PR adds tests for:

//@ revisions: too-high not-power-of-2
//
//@ [too-high] compile-flags: -Cmin-function-alignment=16384
//@ [not-power-of-2] compile-flags: -Cmin-function-alignment=3

//~? ERROR a number that is a power of 2 between 1 and 8192 was expected

The implementation uses Align::from_bytes to parse the value.
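The acceptance rule those tests exercise can be sketched as a standalone function. This is a simplified stand-in, not the compiler's code: the real implementation goes through Align::from_bytes, while `parse_min_function_alignment` below is a hypothetical helper that only reproduces the "power of 2 between 1 and 8192" rule and the error text from the test.

```rust
/// Upper limit stated in the PR description.
const MAX_ALIGN: u64 = 8192;

/// Hypothetical stand-in for the flag's value parser: accepts a decimal
/// number that is a power of two in the range 1..=8192.
fn parse_min_function_alignment(value: &str) -> Result<u64, String> {
    let error =
        || String::from("a number that is a power of 2 between 1 and 8192 was expected");

    // Reject anything that is not an unsigned decimal integer.
    let bytes: u64 = value.trim().parse().map_err(|_| error())?;

    // `is_power_of_two` is false for 0, so 0 is rejected here as well.
    if bytes.is_power_of_two() && bytes <= MAX_ALIGN {
        Ok(bytes)
    } else {
        Err(error())
    }
}
```

With this model, "3" and "16384" are rejected like the two test revisions, and "0" is rejected too because zero is not a power of two.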

> Should this be a warning? If this is intended as a hint, then 🤷.

The flag is deliberately specified as a minimum. If for some external reason the alignment is already higher, the flag still works as intended: the alignment is at least the specified value.

> Doesn't LLVM already insert padding between functions for alignment by default? I believe it aligns to 16 bytes on x86_64 for example. Cranelift also does that.

Yes, there is a per-platform minimum, but RfL wants to bump it higher than that minimum in some cases.

From #t-compiler/help > ✔ Alignment for function addresses:

> Asked because in Linux kernel, to enable dynamic function trace, on ARM64 the function addresses are required to be 8-byte aligned: https://github.com/Rust-for-Linux/linux/blob/rust-next/arch/arm64/kernel/ftrace.c#L95

Note the dynamic check for the address actually being aligned in the linked C code.

@bjorn3
Member

bjorn3 commented Jun 25, 2025

> Yes, there is a per-platform minimum, but RfL wants to bump it higher than that minimum in some cases.

That case is not merely a performance optimization, but actually mandatory. For something mandatory a target modifier makes sense to me.

@folkertdev
Contributor Author

It is still not unsound, though, and based on the RFC summary:

> A target modifier is a flag where it may be unsound if you link together two compilation units that disagree on the flag.

But maybe the definition of when a flag is a target modifier is broader in practice? I agree that in that case you'd want to effectively guarantee all codegen units use some minimum alignment.

@scottmcm
Member

If this is supposed to be a guarantee it's not obvious to me that a -C flag is the right way to do it.

If the goal is that this is for RFL, should RFL just have a different target with a different function alignment setting? Feels like they'd want a different target for things like their different float and vector-register-usage stuff too anyway.

With a -C flag it just brings up a whole bunch of questions to me about what's supposed to happen with, say, an inlined function that's defined in a crate compiled with -Cmin-function-alignment but actually monomorphized in a crate without that flag? Does that still give a "guarantee" about the alignment of it? If someone starts using -Z hint-mostly-unused -- especially if we autodetect it for dependencies, eventually -- does that make -Cmin-function-alignment mostly meaningless in that crate because we delay codegen to the use?

Obviously with a target flag that inconsistency can't happen, so it just feels nicer to me, and avoids a whole bunch of "well should we have a warning that you compiled a crate with a lower value than a dep?" kinds of questions.

@ojeda
Contributor

ojeda commented Jun 25, 2025

> If the goal is that this is for RFL, should RFL just have a different target with a different function alignment setting? Feels like they'd want a different target for things like their different float and vector-register-usage stuff too anyway.

If you mean adding a new built-in target, then we have avoided adding targets for things like this because there would be way too many due to combinatorial explosion and because it would tie us to new Rust releases if new values are needed in the future.

(There was a Linux kernel target years ago, and we didn't use it for those reasons. We ended up talking about "global target features" for things where we just wanted to modify the base target because we knew we would pass the same flags everywhere for a given build, and then finally we had "target modifiers" for things where the ABI really needs to match so it is checked.)

@joshtriplett
Member

I don't know that this needs lang approval, personally. But I would say it would be useful to specify whether this option is a best-effort or a guarantee. And, if it's a guarantee, then how does it interact with inlining? In general, I would expect that if you take the address of an inline function then we'd make sure there's an entry point you can take the address of, which any guarantee could then apply to.

@RalfJung
Member

@workingjubilee

> Then the resulting function is instantiated with the correct alignment? Where "correct" is... well, pick an answer, I guess?

I think #142854 actually changed behavior here. After that PR, the alignment of the crate that contains the function is used, and this gets embedded in the crate metadata as part of the function attributes. Before that PR, the alignment of the crate that codegen'd the function was used.
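The difference between the two behaviors can be pictured with a small model. Everything in it is a hypothetical illustration (the struct, the enum and the function are not rustc types); it only encodes which crate's flag value is consulted for a generic function that is defined in one crate and instantiated in another.

```rust
/// Hypothetical stand-in for the flags one crate was compiled with.
struct CrateFlags {
    min_function_alignment: Option<u64>,
}

/// Which crate's flag decides the alignment of a cross-crate generic function.
enum Policy {
    /// The crate that contains the function (described above as the
    /// behavior after #142854).
    DefiningCrate,
    /// The crate that generates the code (described above as the
    /// behavior before #142854).
    CodegenCrate,
}

/// Returns the minimum alignment that applies to the instantiated function.
fn min_alignment_for_generic(
    policy: Policy,
    defining: &CrateFlags,
    codegen: &CrateFlags,
) -> Option<u64> {
    match policy {
        Policy::DefiningCrate => defining.min_function_alignment,
        Policy::CodegenCrate => codegen.min_function_alignment,
    }
}
```

For a library compiled with a minimum of 32 and a binary compiled without the flag, the first policy yields `Some(32)` and the second yields `None`, which is exactly the case the question above asks to have covered by a test.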

@traviscross
Contributor

We talked about this today in lang triage without specific resolution. We discussed how we may need to see more details here about the use case and the specific ask, and may want to hear from other teams, particularly @rust-lang/opsem and @rust-lang/spec. Some of us have also left individual comments above.

@traviscross traviscross added P-lang-drag-3 Lang team prioritization drag level 3.https://rust-lang.zulipchat.com/#narrow/channel/410516-t-lang. and removed P-lang-drag-1 Lang team prioritization drag level 1. https://rust-lang.zulipchat.com/#narrow/channel/410516-t-lang labels Jun 25, 2025
@RalfJung
Member

From an opsem perspective, the only thing that's tricky here is what exactly we guarantee when mixing different values of this flag in crates that are linked together and form one AM. IMO the "obvious" answer is to use the flag set for the crate that contains the function (so, the flag indeed acts basically like a crate-level attribute), and I think that is the current implementation, but we have no tests for this case. Also, @tmandry expressed that they would expect different semantics, though I don't know how to even phrase that one in AM terms.

Labels
A-attributes Area: Attributes (`#[…]`, `#![…]`) A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-lang-nominated Nominated for discussion during a lang team meeting. needs-fcp This change is insta-stable, or significant enough to need a team FCP to proceed. P-lang-drag-3 Lang team prioritization drag level 3.https://rust-lang.zulipchat.com/#narrow/channel/410516-t-lang. S-waiting-on-documentation Status: Waiting on approved PRs to documentation before merging S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team